Explore moe #26
base: main
Conversation
…sy compression function to support moe
Copilot reviewed 10 out of 22 changed files in this pull request and generated 7 suggestions.
Files not reviewed (12)
- notebooks/explore.ipynb: Evaluated as low risk
- deltazip/modeling/_base.py: Evaluated as low risk
- deltazip/modeling/__init__.py: Evaluated as low risk
- deltazip/__init__.py: Evaluated as low risk
- deltazip/modeling/auto.py: Evaluated as low risk
- deltazip/modeling/_utils.py: Evaluated as low risk
- deltazip/lossless/compressor.py: Evaluated as low risk
- deltazip/modeling/_const.py: Evaluated as low risk
- deltazip/core/sparsegpt.py: Evaluated as low risk
- notebooks/playground.py: Evaluated as low risk
- deltazip/modeling/gpt_neox_moe.py: Evaluated as low risk
- deltazip/modeling/moe/base_generation_strategies.py: Evaluated as low risk
Comments skipped due to low confidence (4)
cli/chat_moe.py:27
- The `args` object is not defined within the `chat` function. Use `model_path` instead.
```python
delta_model = AutoDeltaZipModelForCausalLM.from_compressed(args.model_path, strict=True, device="cpu", unpack=True, trust_remote_code=True)
```
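A minimal sketch of the fix within `cli/chat_moe.py` (which already imports `AutoDeltaZipModelForCausalLM`), assuming `chat` receives `model_path` as a parameter; the function signature is not shown in this excerpt:

```python
def chat(model_path: str):
    # Use the function's own parameter; `args` exists only in the CLI entry point.
    delta_model = AutoDeltaZipModelForCausalLM.from_compressed(
        model_path,
        strict=True,
        device="cpu",
        unpack=True,
        trust_remote_code=True,
    )
```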
cli/chat_moe.py:49
- The `device` parameter in `TextGenerationPipeline` should be an integer, not a `torch.device` object. Use `device=0` instead.
```python
delta_model.to(torch.device("cuda"))
```
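A sketch of the suggested construction, assuming a `tokenizer` is already loaded alongside `delta_model` in `cli/chat_moe.py`:

```python
import torch
from transformers import TextGenerationPipeline

delta_model.to(torch.device("cuda"))
# Per the comment above: pass the GPU index as an integer, not a torch.device.
pipe = TextGenerationPipeline(model=delta_model, tokenizer=tokenizer, device=0)
```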
cli/compress_moe.py:99
- The assertion should be `assert base_model is None or is_moe, "You can only compress a moe without a base representation."`.
```python
assert base_model is None and is_moe, "You can only compress a moe without a base representation."
```
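The difference between the two connectives, as a standalone truth-table check (`base_model` and `is_moe` mirror the variables in `cli/compress_moe.py`):

```python
# `and` passes only when base_model is None AND the target is a MoE, so any
# non-MoE compression would trip the assertion. `or` fails only in the one
# invalid case: a base representation supplied together with a non-MoE target.
for base_model, is_moe in [(None, True), (None, False), ("base", True), ("base", False)]:
    and_ok = base_model is None and is_moe
    or_ok = base_model is None or is_moe
    print(f"base_model={base_model!r:<8} is_moe={is_moe!s:<5} and->{and_ok!s:<5} or->{or_ok}")
```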
cli/compress_moe.py:105
- [nitpick] The repeated block of code for loading the model should be refactored to avoid duplication.
```python
if args.target_model == "gpt_neox_moe": ... elif args.target_model == "llama_moe": ... else: ...
```
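One way to remove the duplication (a sketch; `load_gpt_neox_moe` and `load_llama_moe` are hypothetical helpers that would absorb the existing branches):

```python
def load_gpt_neox_moe(model_path: str):
    ...  # body of the current "gpt_neox_moe" branch

def load_llama_moe(model_path: str):
    ...  # body of the current "llama_moe" branch

# A dispatch table replaces the if/elif/else chain.
MODEL_LOADERS = {
    "gpt_neox_moe": load_gpt_neox_moe,
    "llama_moe": load_llama_moe,
}

def load_target_model(target_model: str, model_path: str):
    if target_model not in MODEL_LOADERS:
        raise ValueError(f"Unsupported target model: {target_model}")
    return MODEL_LOADERS[target_model](model_path)
```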
```python
delta_model = None
config = None
if model_type == "gpt-neox-moe":
    with open(f"{args.model_path}/base/base_model/config.json", "r") as fp:
```
The variable `args` is used here but `model_path` is the in-scope name; use `model_path` instead of `args.model_path`.

Suggested change:
```diff
- with open(f"{args.model_path}/base/base_model/config.json", "r") as fp:
+ with open(f"{model_path}/base/base_model/config.json", "r") as fp:
```
```python
base_weights = load_file(f"{model_path}/base/base_weights.safetensors")

delta_model = AutoDeltaZipModelForCausalLM.from_compressed(
    args.model_path, strict=True, device="cpu", unpack=True, trust_remote_code=True, model_config=config, custom_model=delta_model
)
```
The variable `args` is used here but `model_path` is the in-scope name; use `model_path` instead of `args.model_path`.

Suggested change:
```diff
- args.model_path, strict=True, device="cpu", unpack=True, trust_remote_code=True, model_config=config, custom_model=delta_model
+ model_path, strict=True, device="cpu", unpack=True, trust_remote_code=True, model_config=config, custom_model=delta_model
```
logger.info("Tokenizer loaded") | ||
|
||
logger.info("Loading base_model") | ||
base_model = transformers.AutoModelForCausalLM.from_pretrained(f"{model_path}/base/base_model.pt", trust_remote_code=True) |
The path should be a directory containing model configurations, not a `.pt` file.

Suggested change:
```diff
- base_model = transformers.AutoModelForCausalLM.from_pretrained(f"{model_path}/base/base_model.pt", trust_remote_code=True)
+ base_model = transformers.AutoModelForCausalLM.from_pretrained(f"{model_path}/base", trust_remote_code=True)
```
print(f"base_weights: {base_weights.keys()}") | ||
|
||
for expert_name, expert_weight in base_weights.items(): | ||
prefix, suffix = expert_name.split(EXPERT_ID_PLACEHOLDER) |
The `split` method might fail if `EXPERT_ID_PLACEHOLDER` is not found in `expert_name`. This should be handled properly.

Suggested change:
```diff
- prefix, suffix = expert_name.split(EXPERT_ID_PLACEHOLDER)
+ if EXPERT_ID_PLACEHOLDER not in expert_name: continue
```
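A defensive variant that keeps the split (a sketch, assuming non-expert entries should simply be skipped):

```python
for expert_name, expert_weight in base_weights.items():
    # Without the placeholder, split() returns a single element and the
    # two-way unpack raises ValueError, so guard first.
    if EXPERT_ID_PLACEHOLDER not in expert_name:
        continue
    # maxsplit=1 keeps the unpack safe even if the placeholder repeats.
    prefix, suffix = expert_name.split(EXPERT_ID_PLACEHOLDER, maxsplit=1)
```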
```python
from deltazip.modeling._const import EXPERT_ID_PLACEHOLDER
from loguru import logger
from safetensors.torch import save_file
import safetensors
```
[nitpick] The `safetensors` module is imported but not used directly. Use it directly or remove the import.
```python
# Make sure we only save the non-fc layers (i.e. the layers where MoE isn't applied)
for name in to_remove:
    del sd[name]
model.save_pretrained(f"{args.outdir}/base/base_model", state_dict=sd)
```
The `save_pretrained` method does not accept a `state_dict` parameter. Set the `state_dict` directly on the model before calling `save_pretrained`.

Suggested change:
```diff
- model.save_pretrained(f"{args.outdir}/base/base_model", state_dict=sd)
+ model.load_state_dict(sd)
```
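Worth verifying against the pinned transformers version: `PreTrainedModel.save_pretrained` does accept a `state_dict` keyword in recent releases. If the `load_state_dict` route is taken anyway, note that keys were deleted from `sd` above, so a strict load would fail; a sketch:

```python
# strict=False because the MoE expert entries were removed from `sd` above.
model.load_state_dict(sd, strict=False)
model.save_pretrained(f"{args.outdir}/base/base_model")
```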
layer_type = "LlamaDecoderLayer" | ||
layers_block_name = "model.layers" | ||
outside_layer_modules = ["model.embed_tokens", "model.norm"] | ||
inside_layer_modules = [f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.up_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.gate_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.down_proj"], |
The `inside_layer_modules` attribute is defined as a tuple due to the trailing comma. It should be a list to avoid potential issues.

Suggested change:
```diff
- inside_layer_modules = [f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.up_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.gate_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.down_proj"],
+ inside_layer_modules = [f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.up_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.gate_proj", f"moe.mlp.{EXPERT_ID_PLACEHOLDER}.down_proj"]
```
No description provided.